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Calibrating Array Detectors 

D. J. Fixsen 1 ' 3 , S. H. Moseley 2 , and R. G. Arendt 1 

ABSTRACT 

The development of sensitive large format imaging arrays for the infrared promises 
to provide revolutionary capabilities for space astronomy. For example, the Infrared 
Array Camera (IRAC) on SIRTF will use four 256 x 256 arrays to provide background 
limited high spatial resolution images of the sky in the 3 to 8 /im spectral region. 
In order to reach the performance limits possible with this generation of sensitive 
detectors, calibration procedures must be developed so that uncertainties in detector 
calibration will always be dominated by photon statistics from the dark sky as a 
major system noise source. In the near infrared, where the faint extragalactic sky is 
observed through the scattered and reemitted zodiacal light from our solar system, 
calibration is particularly important. Faint sources must be detected on this brighter 
local foreground. 

We present a procedure for calibrating imaging systems and analyzing such 
data. In our approach, by proper choice of observing strategy, information about 
detector parameters is encoded in the sky measurements. Proper analysis allows us to 
simultaneously solve for sky brightness and detector parameters, and provides accurate 
formal error estimates. 

This approach allows us to extract the calibration from the observations themselves; 
little or no additional information is necessary to allow full interpretation of the data. 
Further, this approach allows refinement and verification of detector parameters during 
the mission, and thus does not depend on a priori knowledge of the system or ground 
calibration for interpretation of images. 



1. Introduction 



The Infrared Array Camera (IRAC) (Fazio et al. 1998) will employ four 256 x 256 imaging 
infrared arrays and the cooled telescope of the SIRTF to produce images of the sky which are 
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limited by the photon statistics from the natural background, which, in this spectral region (8-25 
/im), is dominated by scattered and emitted light from the zodiacal dust particles. This will be 
typical of future applications of infrared detectors in space. In order to produce high quality 
images in the presence of this strong background, the relative response of the different pixels 
in the detector array must be known to high precision. A technique must be developed that 
allows the detector properties to be determined in operation, so that the requisite stability can be 
experimentally verified, and changes in response can be measured and included in the analysis of 
the data. We present a technique by which the detector properties are determined simultaneously 
with the estimates of sky brightness, and formal errors developed for both instrument and sky 
parameters. 

We observe the same area of the sky with the detector array at a number of spatially offset 
positions. These observations are used to set up a system of linear equations involving both sky 
brightness and detector properties. In solving this system of equations, we can deduce the sky 
brightness and detector gain and offset parameters. By appropriate choices of offset spacings and 
sky brightness distributions, this technique allows us to continuously improve our knowledge of 
the detector properties or detect changes. This approach embeds the relative calibration of the 
detector array into the survey process; all information required to produce an internally consistent 
survey can deduced from the survey itself. Since the data on which the calibration is based is 
the survey itself, it is the way to calibrate the data which is, in some sense, least susceptible to 
systematic errors. In the case that an a priori calibration is used, this technique offers a method 
to test internal consistency. 

In this paper, we describe this least squares solution for sky and detector properties, and 
suggest implementations of the technique for the IRAC instrument. We present the analysis of 
synthetic Wide-Field Infrared Explorer (WIRE) data and real Hubble NICMOS data, in which we 
derive the sky brightness, detector gain and detector offset. (We had planned a demonstration of 
the technique on the Wide-field Infrared Explorer, but its unfortunate demise renders the point 
moot.) The results are encouraging, and form the basis of our plans for the analysis of the IRAC 
imaging data. Optimization of the observational strategy to produce the best encoding of the 
detector parameters in the survey observations is treated in a separate paper (Arendt, Fixsen & 
Moseley 2000). This approach can offer significant insurance to the observer, in that regardless 
of the availability or applicability of independent relative calibration data for the instrument, 
sufficient information is present in the observations themselves to allow the relative calibration of 
the data. This provides the capability for the observer to validate the statistical properties of the 
data or to calibrate it as required. 

Least squares techniques are an important staple of model fitting. In this paper, we use a 
least squares technique, combined with sampling over a wide range of spatial scales, to produce an 
intensity calibration for the imaging system. Investigators have long used "sky flats" to produce 
estimates of system response (e.g. Joyce, 1992). In this process, images are taken at a variety of 
positions around the object of interest. These images are often processed using median filtering to 
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produce estimates of detector response. In this paper , we derive the full algorithm for optimal use 
of the sampled data for intensity calibration of an imaging detector. This algorithm then allows us 
to ask more sophisticated questions important for planning observations, such as comparing the 
relative goodness of different sampling procedures (Arendt et al. 1999). This algorithm provides 
the optimal tool for calibrating imaging detectors; if the algorithm does not produce reliable 
results, it is indicative of incompleteness in the sampling of the sky. The algorithm provides an 
optimal detector calibration based on the data provided it. If a priori information about the 
detector is known, the algorithm can be adjusted to include it. 

Other algorithms have been described in the literature for analyzing dithered image data. The 
drizzle method (Fruchter and Hook, 1998) is an approach for combining undersampled dithered 
images to produce a single combined image with improved resolution and signal-to-noise ratio. 
However, this technique is a means of a producing final image from calibrated data, and is not 
intended as a method of deriving the detector calibration. 

Future observatories will generate survey data. The accuracy of analysis of these data will 
depend on a clear understanding of the statistical properties of the uncertainties in the data, their 
level, and spatial and temporal correlations. We present an approach for the analysis of such data, 
with specific application to the imaging data from the SIRTF IRAC instrument. 

This comprehensive least squares approach has been successfully applied to the analysis of the 
data from the FIRAS instrument on COBE, in which a complex instrument model was required 
(Fixsen et al.1994). 

2. Overview 

The following equations show the derivation of the simultaneous extraction of sky brightness 
and instrument parameters to the data. The advantages of this system are: 1) It uses the same 
data for calibration and observation which saves separate observation time for calibration and 
uses the same time and exactly the same conditions for calibration and observation. 2) It uses a 
well understood process for calibration allowing for complete error analysis and flexible response 
in the case that unexpected errors arise. 3) It explicitly includes the uncertainties and correlations 
introduced in the calibration process in the uncertainties of the resulting data. We focus on an 
imaging array observing sections of the sky, but the derivation is either directly applicable or 
easily generalized to other problems. 

The underlying process is a simple linear fit which is easily understood, although the matrices 
involved are unwieldy. The inverses of the matrices are assumed to exist. If there are problems 
inverting these matrices, it is an indication that information is missing in the calibration process. 
We do not go into detail about the convergence or singularities of the process, but these need to be 
addressed as they show key weaknesses in the calibration process and can generally be corrected 
by improving the measurement strategy (Arendt et al. 2000). 
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Since the details of the calibration process leave their impact on the noise characteristics of 
the final data set, the procedure for taking data must be be carefully designed. This is not unique 
to this particular process for calibration, but this procedure makes the costs of poor measurement 
strategies obvious. 

3. Derivation of the Algorithm 

We follow the Einstein summation convention and use different indices for the different 
vector spaces. Latin indices are used for the raw data and instrument pixels while greek indices 
are used for the derived solution and the sky pixels. We use the same variable names for the 
contravariant and covariant cases even though the numerical values are different, because the 
underlying information is the same (see Table 1). 

Consider the general solution, where we have a model of the data, H l (9^), where 6> M is a 
vector of parameters which includes both detector and sky parameters. First we linearize the 
equation, about a point M at or near the solution yielding: 

H i (0> i )KiH i (& i )+Hl6i i , (1) 

where H l = dH l /d0^. The derivatives are performed at and 6^ are perturbations from 0^ 

Given a data set D l we define A 1 = D l — H l (®^). With a symmetric weight matrix, Wij, x 2 
is calculated as 

X 2 = {X - H^W^ti - Hi8 v ) (2) 
and its minimum is determined by 




-WvWijiA* - Hi?) - (A* - m^WijHi 



= -2HiWijA* + 2HiW ij Hi6 v = 0. (3) 
Thus the solution for 5^ can be expressed as 

6" = (H^WijHiy'H^WkA 1 = {HlH %v )- l H k u A k . (4) 

There are several potential pitfalls here particularly if the second derivative, H l = dH^/dO", 
is ill-behaved in the region of interest. If H l 5^S U > 1 the expansion point is too far from the 
solution. A new B^ 1 closer to the solution should be used. If H*" (H i Hi V )~ l is close to 1 or larger 
a full differential geometric treatment is in order which is beyond the scope of this paper. 

The inversion of the matrix H^H^ is the hard part of the problem. In what follows we show 
how properties of this matrix that frequently exist can reduce the problem to one that can be 
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computed on a modest computer. The inverse of the matrix is also the covariance matrix of the 
parameters including the sky parameters. 

It is also interesting that: 

^ = #;a,. (5) 

This is a simple mnemonic to remember the solution. It also shows that the covariant form of the 
solution on the left is like the covariant form of the data on the right. This is the weighted form 
of the solution needed if one desires to fit this solution to some higher level theory. This can be 
done even if the matrix cannot be inverted. 

To develop a more tractable form of equation (4), we separate the detector parameters from 
the sky parameters. 

6 li = (X 1 ...X v ,6S 1 ...8S T ). (6) 

The parameters are not required to have the same units; the weight matrix has all of the 
appropriate inverse units. Analogously the parameter weight matrix is separated into 3 parts, 




HlWyHl = I rp I . (7) 



The part dealing with the instrument is A = H^WijH-J,. The part dealing with the resulting 
sky map is C = H^WijH^. And the connections between them are B = H % a WijH 3 q . The covariance 
matrix (inverse of the weight matrix) is broken into the same sorts of parts. Often, each detector 
observes only one sky pixel at a time and the weight matrix is simple enough that the large 
submatrix, C = H l a WijHp can be easily stored and inverted. Let us then consider 




(fljWyffj)" 1 = (flj H*)- 1 = * . (8) 



The inverse or covariance can be calculated by: 

Q = (A- BC~ 1 B T )- 1 (9) 
R = -QBC~ 1 (10) 

and 

* = c- 1 + C~ 1 B T QBC- 1 . (11) 

When the only interest is in the uncertainties in the array parameters, (e.g. when the 
calibration is used for other data) only Q is needed. Similarly, if only the sky uncertainties are 
required, only ^ is needed. 

The covariance of the derived sky, is composed of two parts. The C _1 is the direct 
propagation of the measurement errors to the sky. The other part C~ 1 B T QBC~ 1 shows the 
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additional uncertainty due to the calibration. For a well chosen set of observations this part can 
approach (V/PM)C~ 1 , the limit set by the number statistics. 

The matrix, Q, is much smaller than H^H^, but still may be inconveniently large. Equation 
(4) is really a system of linear equations. By substituting equation (8) into equation (4) and 
retaining only the first V equations we have: 

X = {QH l q + RH l a )A t = QY (12) 

where 

Y = H* q Ai - BC^HiAi. (13) 

The matrix A relates the detector parameters to each other. With care these can be chosen 
so that the matrix can be inverted. With the size and speed of modern computers this is can even 
be accomplished with brute force techniques. In many cases A will be a multiple of a kernel which 
is the result of a single observation. 

Now to get a form of equation (12) suitable for computing, let 

T = A-^BC' 1 ' 2 = {HlH ir T l l 2 HlH ]a {H k a H kp y l l 2 . (14) 

Then 

X = {A-BC~ 1 B T )- 1 Y = A- T l 2 {I-A- 1 ' 2 BC- 1 B T A- T l 2 )- 1 A- l l 2 Y = A~ T I 2 {1—TT T )~ 1 A~ 1 I 2 Y. 

(15) 

Like B, the size of T is V x T, but it is sparse. 

Finally, we use (I — TT T ) _1 = 2^nLo(-^-* )™ ^° S e t a form that is tractable with a modest 
computer. Although formally the sum must be carried to infinity the sum converges in tens to 
hundreds of iterations for well chosen observations. Then, 



X = QY = A- T/2 



Ln=0 



T\n 



A- l ' 2 Y. (16) 



The matrix, TT T , is avoided by defining Zq = A l / 2 Y ', and iterating 

Z n+l = Z + T(T T Z n ) (17) 

until Z is stable. It is trivial then to get the solution X = A~ T I 2 Z . This is only the solution for 
the detector, but the solution for the sky is then straight forward. 



4. Example 

Next we show how the algorithm is used in a practical program. Some of the key details are 
given in the appendix, here we outline the steps of the program and relate them to the previous 
derivation. 
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Table 1. Variable Definitions 



Variable 


Definition 


P 


number of Array pixels, e.g. 256 x 256 = 65536 


V 


number of detector parameters, e.g. 2P = 131072 


M 


number of images in the data set, e.g. 100 


i,j,k 


are indices to data € (1 . . . P x M) 


D l 


data 


A i 


model error 


V i 


data variance (assumed to be diagonal) 


r 


number of observed sky locations, e.g. 500000 


Q, (3 


indices to sky locations e(l...r),r<Px M) 


S a 


set of sky parameters 


V 


index to pixels <G (1 . . . P) 


G p 


set of gain parameters 


pp 


set of offset parameters 


q,r 


indices to detector parameters E (1 . . . V) 


x« 


set of detector parameters (SF P ,5G P ) 


fl, v, to 


indices to all parameters € (1 . . . V + Y) 
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We adopt a simple model for the data, but more complex models are as easily handled as long 
as they are relatively linear in the range of interest, do not require large numbers of parameters 
to be determined, and are not undetermined. A formal derivative must be calculated for each 
of the extra parameters and coded into the algorithm, while this may be messy and clutter the 
program, small numbers of parameters (e.g. temperature effects) that affect the entire array make 
only small changes to the required time or the final accuracy of the algorithm. If some part of the 
parameter space is undetermined the program may not converge. 

Our example has a separate gain, G, and offset, F, for each detector that modify the sky 
intensity, S, as it is detected. The model, H l for the data is given by 

H i (G p , S a , F p ) = G p S a + F p . (18) 

X = (SG 1 . . . 5G P , SF 1 . . . SF P ) (19) 

The example is obviously nonlinear and we must be careful to chose an initial point close enough 
to the solution for the algorithm to converge to the solution. For a particular detector array one 
would use the algorithm many times so one can use the last solution as the starting point and 
either add more data to improve the solution or find a new solution with new data. Either way, 
only once, do we need to start without a previous solution. In that case we can let G p = 1, F p = 0, 
and V p = J2ie P (D % — F p ) 2 /M. Then with the assumption that the uncertainties are a function of 
pixel only, we have an estimate for V 1 . We will return to this estimation in section 6. 

We assume a diagonal weight matrix Wu = {V l )~ l to keep the example simple. However, we 
emphasize that this is not required. The derivation is completely general and can accommodate 
a nondiagonal weight matrix. Note that this assumption does not mean that the data are 
uncorrelated. Indeed, the data are correlated as some of the data are derived from the same pixel 
or are observations of the same part of the sky with different detectors. If there are other sources 
of correlation (such as detector temperature) they need to be explicitly included in the model. 
The assumption here is that the residual errors are uncorrelated. 

The first step of the program is to calculate 

Aj = Wiip* - G p S a - F p ). (20) 



As there are two types of parameters we divide the matrix A into its four quadrants for 
discussion. 

A m 

Each of the submatrices of A is diagonal, including the part relating the gain and offset of 
each pixel. The whole matrix is treated as P 2 x 2 matrices. There is not a unique A -1 / 2 , 
mathematically the choice is arbitrary, but the symmetric choice and the choice where the lower 
left are zero are easier to program. We have used both and found the nonsymmetric version is less 
susceptible to numerical instability. 
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Although the size of H l is PM x (V + T) it can be treated as a set of delta functions. With 
care in the processing, the parts of H that are zero need never be accessed (appendix). There are 
3P x M nonzero parts. That is each datum appears 3 times, once associated with G, F and S. 

d GP W = S a 5 pi 

The second step makes use of the following relations: d FP H l = 5 p i to construct 

ds°IP = GP5 ai 

diag Ag= Y S a WiS a , diag A F = Y W i ( 22 ) 

i&p,i&a idp 

diag A FG = Y, S " W i ( 23 ) 

i£p,i£a 

and 

diag C= Y G p WiG p . (24) 

iea,iep 

C is diagonal as well. 

The matrix B is divided into two parts similar to A: 

B G = Y S a WiGP, B F = Y W i° P - ( 25 ) 

Finally Y has two parts 

Y G = Y S a A i -B G C- 1 Y ° PA ^ Y F = Y^-B F C- 1 Y GPA i- ( 26 ) 

Note that B is 2P x T but it is sparse. We then calculate T = A^BC' 1 / 2 . With the 
elements of equation (16), the program iterates equation (17) until Z is stable. Then the solution 
X = A~ T I 2 Z. This is only the solution for the detector, but the solution for the sky is then: 

S a = Y [{D l - F p )G p Wi]/ Y ( GP f w i- ( 27 ) 



This then is a form which can be handled by a modest computer. The vectors X and Y 
are each only 2P = 131072 long. The matrix A is stored as three P long diagonal parts of its 
submatrices. The matrix T is nominally large (2P x T) but is sparse and has at most 2P x M 
nonzero components. 

At this point there are two obvious singularities. These correspond to the uniform change in 
the sky brightness and a cancelling change in the offset, and to a multiplication of the sky by an 
arbitrary amount and a cancelling effect in the gain term. These two singularities point out what 
we already know; in order to get an absolute calibration we need an absolute standard. There are 
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several ways to deal with this issue: 1) An absolute calibration could be done in the laboratory. 2) 
Certain places on the sky could be determined in some other way and used to impose a condition 
that would break the singularity. 3) A map could be produced with an arbitrary gain and offset. 

The three methods are not mutually exclusive. A map with arbitrary gain and offset can be 
produced which is subsequently calibrated by laboratory measurements or sky measurements or 
a combination of sky and laboratory measurements. The absolute calibration can be included in 
the fit or applied later. We choose to apply it separately as this maintains the uniformity of the 
algorithm whether viewing a calibration object or not. 

Without treating the singularity, the sum in equation (16) does not converge. If there 
are no dark frames to determine the offset, after each iteration we impose the condition that 
J2 P \Jj2ie p Wi 5F P = 0. The weight applied to the 5F P is only for computational convenience (it 
is the form of F p in Z). The key is that the net offset is not allowed to change. If dark frames are 
present we can use them to determine the offset and do not impose this condition. Similarly, a 
weighted mean gain is held fixed, J2 P \JJ2iep,iea S a WiS a 5G P = 0. 

This completes the solution for the detector and the sky. The calculation of a single uncertainty 
vector is completely analogous. However the full covariance matrix ^ is ~ 500000 x 500000. 
This matrix is symmetric but it is not sparse. In fact it is likely that all of the elements are 
nonzero. The 2.5 x 10 11 components of ^ are awkward to carry around but they contain all of 
the information about the correlations imposed by the calibration process. It can be stored more 
compactly by keeping T, and noting that 

^ = C~ 1/2 {I - T T T)- l C- 1/2 (28) 

since T is sparse and C -1 ' 2 is diagonal. 

Now we return to the issue of variance (weight) estimation. Without a model for the noise we 
have a hopeless task. However with a simple model we can estimate the variance. An unbiased, 
but poor, estimation only increases the noise (and the estimate of the uncertainty). 

In the model program we assume three sources of error: l)Poisson statistics, 2)A pixel 
dependent readout noise, 3)A cosmic ray induced error. The Poisson noise is easily calculated 
if the approximate gain of the system is known. The readout noise is best estimated by using 
the RMS of all of the data from that pixel (except the cosmic ray contaminated data). Cosmic 
rays are identified by seeking large discrepancies. These should not be used in either the sky or 
variance estimation. Obviously as data are collected a more detailed model can be developed. 

After a solution is found, the model program recalculates A*. Data with errors greater than 
2.5 a are assumed to be cosmic particle hits or other glitches. These data are marked and not 
used in the next iteration. The remaining A*s are squared and summed to estimate the noise. The 
model program noise is treated separately for each pixel. If hundreds or thousands of pictures are 
available this process could potentially identify subtle problems with particular pixels. If fewer 
data are available a smooth approximation over entire detector array is more appropriate. 
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5. Practical Matters 

The algorithm described in the preceding sections can produce mosaics of large regions 
provided that at least some parts of the region (preferably all parts) contain repeated (dithered) 
observations. The algorithm can be applied to a data set containing spatially separate regions. 
There is no constraint on the size or geometry of the region(s) in the data set. It is only required 
that the detector gain and offset and the sky intensity (G p , F p , and S a ) are constant for the 
entire data set. These restrictions can be relaxed by explicitly parameterizing known or suspected 
variations. 

If dark frames are available, they are added to the data set as if they are observations of a 
region of sky that is separate from the rest of the data and that has an intensity S a = 0. The 
addition of dark frames to the data set allows the algorithm to determine the offset components. 

The algorithm can be implemented in a general manner, such that the detector dimensions 
and number of frames processed are adjustable. A general code can be applied to different data 
sets from different instruments if a new "front end" is written for each type of data to ingest the 
data and provide the necessary initial estimates and control parameters. 

The selection of the weights (Wo) to use in the algorithm can be important. Poor weighting 
of the data may cause spurious features to propagate through the solutions for G p , F p , and S a . 
Cosmic ray hits on the detector also cause spurious features in the results, if not properly handled. 
Data affected by cosmic rays can be given very low weights or flagged. It is best if the effects of 
cosmic rays are removed from the data before processing, though this is not always possible. The 
algorithm can recognize cosmic rays as outliers provided that they are not so numerous that they 
sever ly bias the results. 

In most cases, the algorithm will be used iteratively for 2 - 5 cycles. Subsequent iterations 
use the previously derived gain and offset values as inputs, and make use of successively improved 
weights and exclusions of cosmic rays as well. 

An IDL implementation of this algorithm requires free memory ~15 times larger than the size 
of the data set to be processed. For a data set of 27 256x256 images the algorithm takes ~450 
seconds of CPU time on a 300 MHz Pentium II machine running Red Hat Linux 5.2 and IDL 5.0.3. 
About 270 seconds of that CPU time is spent in the calculation of the summation of equation (16), 
using the iterative step of equation (17) for 100 terms. The key data arangements of the program 
are discussed in the appendix. The time for the procedure is linear with the number of input data 
elements as long as more iterations are not needed. The number of iterations required is strongly 
related to the connection map which is determined by the dither pattern of the input data. 

Solving only for detector gains in cases where the detector offsets are negligible is a minor 
simplification of the algorithm and is a more robust procedure. Figure 1 illustrates the results of 
using this procedure to solve for only the detector gains and sky intensities. The data used is 
from Wide-Field Infrared Explorer (WIRE) simulations. The model for the sky includes point 
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sources, cirrus and a zodiacal background. The model for the 128 x 128 detector array included 
gain variations, bad pixels, and cosmic ray hits. The data set consists of 10 dithered images, 
one of which is shown in the upper left of Figure 1. The detector gain variations dominate the 
qualitative appearance of the data. The derived gain compares favorably with the true gain, with 
the exception of ~ 0.2% of the pixels with remaining artifacts from bad detectors and cosmic rays. 
There is a small (~ 1.0025) scale factor between the derived and true gains, which reflects the lack 
of absolute calibration in the procedure. The derived sky is a good representation of the real sky, 
with the additional noise component indicated by the second term of equation (11). 

Figure 2 illustrates the application of the algorithm to real data, namely the HST NICMOS 
observations of the Hubble Deep Field - South. The raw data used here were 59 good 1152 and 
1472 s integrations. The worst effects of cosmic rays were eliminated by calculating linear fits to 
the multiaccum readouts from each pixel. Fits with poor correlation coefficients were refit using 
a combination of linear and step functions. Additional pre-processing involved subtracting the 
median value of each quadrant of each frame from that frame quadrant. This helped compensate 
for a variable "pedestal" effect which is not modelled by our current algorithm. (The bottom 
16 rows of each frame were ignored in the processing to avoid vignetting effects.) The initial 
gain map was assumed to be flat and unity. The initial offset map was assumed to be flat and 
zero. A dark file from the NICMOS reference files was used for a simulated dark frame that was 
processed simultaneously with the sky data. The derived sky after 2 iterations of the algorithm 
and truncation of the series expansion after 100 terms, is not as clean as the publicly released 
processed data. Spurious large scale structure is present at low levels. A faint stripe along the 
detector columns is visible through the brightest star in the field. The gain and offset maps are 
similar to calibration flat and dark reference files. In our derived gain and offset maps there are 
residual defects in pixels where the bright sources in the map were observed. The gain and offset 
maps also contain visible quadrant errors and vertical bars from "shading" because of instrumental 
effects that are not adequately described by the simple method used here. Clearly there is room for 
improvement, but the algorithm worked well. The process allowed the simultaneous determination 
of sky and detector parameters using only sky measurements and dark frames. By inspecting the 
residuals there are indications that the offsets are not constant from observation to observation. 
This suggests an improved model for the data could be constructed by parameterizing and fitting 
these offsets. 

In the case of IRAC, such an algorithm is essential. With it, we can continuously derive 
detector parameters from the normal observations and improve the model of the detectors as 
well as the model of the sky. Just such a procedure was used on the FIRAS data to improve the 
sensitivity by a factor of ~ 100 over the initial publication. 
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6. Uncertainties and Correlations 

The algorithm produces a formal estimate of the uncertainties, ty, based on the derivation 
and the estimated uncertainties of the input data, V. The resulting uncertainties are only as good 
as the uncertainty estimates of the original data. Those uncertainties, V, are checked against the 
actual deviations from the model to either give an improved estimate of the input data uncertainty 
or an indication of short comings of the model. 

Identifying the weight matrix (or metric) as the inverse of the covariance matrix, only defers 
the question to how to determine the covariance matrix. There are two sorts of ways to attack this 
problem. The theoretical approach uses a priori knowledge about the system to estimate what the 
noise should be. This includes such things as the Poisson arrival of photons, the Johnson noise of 
the resistors and other known sources of noise. The empirical approach uses the residuals in the 
data itself to make an estimate of the noise. Each approach has its strengths and weaknesses. The 
theoretical approach often underestimates the noise because there are unmodeled noise sources 
present. The empirical approach often overestimates the noise, as it treats parts of the signal 
that are not properly modeled as noise. If both approaches lead to the same estimate one has 
reasonable assurance that the model and estimate are correct. If the approaches differ significantly 
there are either noise sources that are included in the estimate or signal that is not included in the 
model. In this example, we assume that the noise variance V is known. 

The calibration process introduces correlations into the resulting map, C~ 1 B T QBC^ 1 . The 
correlations for a single detector are easily generated by using a unit vector in place of the Y in 
equation (12) and carrying out the calculations as with the data. The process is slightly shorter 
than for the data (checking for convergence is omitted). Obviously this could be repeated for each 
of the detectors and then equation (11) could be used to generate the full covariance matrix. 

There are two problems with this approach. First, the time required is proportional to the 
number of detectors (65536 for IRAC or NICMOS data). Second, the space for the final result is 
r 2 , which is ~ 10 11 , for even the modest WIRE example shown here. Storing and using such a 
large data set is problematical. 

Fortunately the correlations for different detectors are nearly identical (see figure 3). This 
should not be a surprise since the detectors are locked into their relative positions and all move 
together in each dither move. Bad detectors in the array, cosmic rays, and rotations will obviously 
break this symmetry but except for the rotations the effects are minor and rather localized. So 
the correlations can be calculated for typical detectors and the results can be used for the entire 
data set. 
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7. Summary 

As demonstrated with the simulated WIRE data the program can calculate the gains and 
offsets to the theoretical limit on the accuracy if it is given a good model of the data. As shown 
with the NICMOS data the program works reasonably well on real data as well even with he 
normal complexities of real errors and uncertainties. The uncertainties are calculated, and the 
correlations can be calculated with minor changes to the program. These allow the user to 
interpret the result without ad hoc assumptions or guesses about how the errors are related. The 
speed of the program allows modest data sets to be processed in a few minutes and with the 
availability of machines with large memories will allow the large data sets of the future to be 
processed in reasonable times. 
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A. Code Considerations 

This appendix points out several details of implementing this least squares calibration 
procedure in a computer code. The first detail is that a sparse matrix storage and multiplication 
system must be applied. As presented here, the solution (eq. 16) requires construction of the 
matrix T which has dimensions T x V. For a 256 x 256 detector array, T contains at least 
256 4 = 17 x 10 9 elements, making it difficult to store in memory However most of the elements of 
T are 0.0, because a single detector, p, observes no more than M of the T sky pixels. Following the 



-15- 



example presented in §4, we note that T contains the same non-zero elements as B. Furthermore, 
each datum leads to one element in Bq and one element in Bp (eq. [25]). Thus B or T can 
be stored in an array corresponding to the P x M data, and replicated for each of the detector 
parameters (gain, offset, etc.) to be determined. The position within the array indicates the 
p G index of the element, while the a G [1,T] index is stored in a separately constructed 

array. In this way, the storage requirements are reduced by a factor of ~ (V/P)M/Y which is 
generally very large as M <C T for most datasets. 

The second detail is to note that equation (17) is can be implemented as a pair of matrix x 
vector multiplications: T T Z n , followed by T(T T Z n ). This pair of multiplications is much faster 
and requires negligible storage compared to calculating the matrix multiplication TT T first, and 
then (TT T )Z n . The TT T matrix is not nearly as compact as the T matrix. Furthermore, with 
the appropriate juggling of indices both matrix x vector multiplications are performed using the 
stored format of T without explicitly calculating the transpose of T. An example of this is found 
in Press et al. (Chapter 2, 1992). 
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Fig. 1. — The top left image shows one of ten frames of simulated WIRE data. The top center image 
shows the detector gains derived from the data, while the top right image shows the actual gains 
used to generate the simulated data. The lower left graph shows the histogram of the differences 
between the derived and actual gains. The bottom middle and right images show the derived sky 
intensities and the true sky used to generate the simulated data. 

Fig. 2. — The raw data is one of the NICMOS multiaccum frames after fitting linear fits to the 
readouts from each pixel and removing the worst of the cosmic rays. The other pairs of derived 
and reference images are each shown on equivalent scales. The derived gain and offset maps only 
cover the upper 256 x 240 detectors in the 256 x 256 array. 

Fig. 3. — The panels show six columns of the 2P x 2P matrix A T l 2 QA l l 2 for a 256 x 256 detector 
array and an idealized data set collected using a dither pattern consisting of 36 pointings evenly 
spaced along the sides of a Reuleaux triangle. The columns are reformated into 256 x (256*2) arrays. 
From left to right and top to bottom the columns correspond to those containing the correlations 
for G p {p = [128, 128], [16, 128], [16, 16]) and F p (p = [128, 128], [16, 128], [16, 16]). Correlations 
against G p and F p map into the bottom and top half, respectively, of each panel. Black indicates 
strong positive correlations. Displayed ranges for G P G P , F P F P , and G P F P = F P G P correlations 
are [1.5 10~ 3 , 1.55 HT 3 ], [1.2 1(T 3 , 2.0 10~ 3 ], and [-8 10~ 5 , 8 10~ 5 ] respectively. 
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